Increasing the generalization capability of Discriminative Training (DT) of Hidden Markov Models (HMM) has recently gained increased interest within the speech recognition field. In particular, achieving such increases with only minor modifications to the existing DT method is of significant practical importance. In this paper, we propose a solution for increasing the generalization capability of a widely used training method – the Minimum Classification Error (MCE) training of HMM – with limited changes to its original framework. For this, we define boundary data – obtained by applying a large steepness parameter – and confusion data – obtained by applying a small steepness parameter – on the training samples, and then perform a soft interpolation between these according to the occupancy counts of the boundary data and the occupancy ratio between the boundary and the confusion data. The final HMM parameters are then tuned in the same manner as in MCE by using the interpolated boundary data. We show that the proposed method achieves lower error rates than a standard HMM training framework on a phoneme classification task for the TIMIT speech corpus.
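To make the interpolation idea concrete, the sketch below illustrates how a sigmoid-based MCE loss with a large versus a small steepness parameter weights training samples, and how the two resulting per-sample weight sets could be softly mixed by their occupancy ratio. The function names, the specific steepness values, and the NumPy formulation are illustrative assumptions, not the authors' implementation; the misclassification measure `d` is taken as given.

```python
import numpy as np

def mce_weights(d, gamma):
    """Derivative of the sigmoid MCE loss l(d) = 1 / (1 + exp(-gamma * d))
    with respect to d. A large gamma (steep sigmoid) concentrates weight on
    samples near the decision boundary (d ~ 0); a small gamma spreads weight
    over a broader, more confusable region of the training data."""
    s = 1.0 / (1.0 + np.exp(-gamma * d))
    return gamma * s * (1.0 - s)

def interpolated_weights(d, gamma_boundary=8.0, gamma_confusion=1.0):
    """Soft interpolation between 'boundary' weights (large steepness) and
    'confusion' weights (small steepness). The mixing factor here is the
    ratio of their total occupancies (sums of weights) -- an assumed,
    simplified stand-in for the occupancy-based interpolation in the paper."""
    w_boundary = mce_weights(d, gamma_boundary)
    w_confusion = mce_weights(d, gamma_confusion)
    occ_b, occ_c = w_boundary.sum(), w_confusion.sum()
    alpha = occ_b / (occ_b + occ_c)  # occupancy-ratio mixing factor in [0, 1]
    return alpha * w_boundary + (1.0 - alpha) * w_confusion
```

In this simplified view, the interpolated weights would then drive the same parameter-update machinery as standard MCE, so the surrounding training framework needs no structural change.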